-
Notifications
You must be signed in to change notification settings - Fork 2.7k
[clickhouse] Rework ClickHouse Attributes to look up type for string attributes #7815
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
[clickhouse] Rework ClickHouse Attributes to look up type for string attributes #7815
Conversation
Signed-off-by: Mahad Zaryab <[email protected]>
Signed-off-by: Mahad Zaryab <[email protected]>
Signed-off-by: Mahad Zaryab <[email protected]>
Signed-off-by: Mahad Zaryab <[email protected]>
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #7815 +/- ##
==========================================
+ Coverage 95.44% 95.51% +0.07%
==========================================
Files 305 312 +7
Lines 16192 16601 +409
==========================================
+ Hits 15454 15857 +403
- Misses 577 579 +2
- Partials 161 165 +4
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
Signed-off-by: Mahad Zaryab <[email protected]>
Signed-off-by: Mahad Zaryab <[email protected]>
Signed-off-by: Mahad Zaryab <[email protected]>
Signed-off-by: Mahad Zaryab <[email protected]>
Signed-off-by: Mahad Zaryab <[email protected]>
Signed-off-by: Mahad Zaryab <[email protected]>
Metrics Comparison SummaryTotal changes across all snapshots: 0 Detailed changes per snapshotsummary_metrics_snapshot_cassandra📊 Metrics Diff SummaryTotal Changes: 0
summary_metrics_snapshot_cassandra📊 Metrics Diff SummaryTotal Changes: 0
summary_metrics_snapshot_cassandra📊 Metrics Diff SummaryTotal Changes: 0
|
|
|
||
| // getAttributeMetadata retrieves the types stored in ClickHouse for string attributes. | ||
| // | ||
| // The query service forwards all attribute filters as strings (via AsString()), regardless |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is that true? iirc the UI sends the attributes as JSON, which mean at least numeric types might be coming not as strings, but then there is another translation layer from HTTL handler into AttributeMap in the QueryService
| for key, attr := range query.Attributes.All() { | ||
| var attrType string | ||
| var val any | ||
| attributeMetadata, err := r.getAttributeMetadata(query.Attributes) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we will need to introduce some cache TTL parameter such that we don't keep running this query on every other query. The metadata doesn't change frequently.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this cache TTL parameter be exposed in the config? I can do this in following a PR
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes it should be configurable but with sensible default
Signed-off-by: Mahad Zaryab <[email protected]>
Signed-off-by: Mahad Zaryab <[email protected]>
) ## Which problem is this PR solving? - Towards #7134 ## Description of the changes - Update the ClickHouse test driver which is used in unit tests to handle multiple queries. This is done in preparation for #7815 which adds an additional query to the attribute_metadata table. ## How was this change tested? - CI and unit tests ## Checklist - [x] I have read https://github.com/jaegertracing/jaeger/blob/master/CONTRIBUTING_GUIDELINES.md - [x] I have signed all commits - [x] I have added unit tests for the new functionality - [x] I have run lint and test steps successfully - for `jaeger`: `make lint test` - for `jaeger-ui`: `npm run lint` and `npm run test` --------- Signed-off-by: Mahad Zaryab <[email protected]>
| func (r *Reader) buildStringAttributeCondition(q *strings.Builder, args *[]any, key string, attr pcommon.Value, metadata attributeMetadata) error { | ||
| levelTypes, ok := metadata[key] | ||
|
|
||
| // if no metadata found, assume string type |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@yurishkuro what're your thoughts on this?
Signed-off-by: Mahad Zaryab <[email protected]>
Signed-off-by: Mahad Zaryab <[email protected]>
Signed-off-by: Mahad Zaryab <[email protected]>
Signed-off-by: Mahad Zaryab <[email protected]>
Signed-off-by: Mahad Zaryab <[email protected]>
Signed-off-by: Mahad Zaryab <[email protected]>
Signed-off-by: Mahad Zaryab <[email protected]>
Signed-off-by: Mahad Zaryab <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR reworks the ClickHouse reader to support querying attributes that are provided as strings but may be stored with different types in ClickHouse. The query APIs only accept string values for tag filters, so the implementation looks up the actual stored types from an attribute_metadata table to correctly query the typed columns.
Changes:
- Added attribute metadata model and querying functionality to determine stored types for string attributes
- Refactored attribute condition building into separate functions for each type
- Updated integration tests to reflect newly enabled tag-based query tests
Reviewed changes
Copilot reviewed 11 out of 11 changed files in this pull request and generated 5 comments.
Show a summary per file
| File | Description |
|---|---|
| internal/storage/v2/clickhouse/tracestore/dbmodel/attribute_metadata.go | New model representing attribute metadata from ClickHouse |
| internal/storage/v2/clickhouse/tracestore/attribute_metadata.go | Implementation of metadata retrieval and query building |
| internal/storage/v2/clickhouse/tracestore/attribute_metadata_test.go | Tests for attribute metadata error cases |
| internal/storage/v2/clickhouse/tracestore/reader.go | Refactored attribute filtering to look up types for string attributes and build typed conditions |
| internal/storage/v2/clickhouse/tracestore/reader_test.go | Updated tests to include attribute metadata queries and new test cases |
| internal/storage/v2/clickhouse/sql/queries.go | Added SelectAttributeMetadata SQL query |
| cmd/jaeger/internal/integration/clickhouse_test.go | Updated skip list to enable previously failing tag-based tests |
| internal/storage/v2/clickhouse/tracestore/snapshots/*.sql | SQL query snapshots for test verification |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| decoded, err := base64.StdEncoding.DecodeString(attr.Str()) | ||
| if err != nil { | ||
| return fmt.Errorf("failed to decode bytes attribute %q: %w", key, err) | ||
| } | ||
| val = string(decoded) |
Copilot
AI
Jan 19, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The bytes attribute handling is incorrect. After decoding the base64 string to bytes, the code converts the raw bytes to a string using string(decoded), which can produce invalid UTF-8 and doesn't match the storage format. The database stores bytes attributes as base64-encoded strings (see line 377 in buildBytesAttributeCondition). The decoded bytes should be re-encoded to base64 before storing in val, or simply use the original base64 string directly without decoding.
| decoded, err := base64.StdEncoding.DecodeString(attr.Str()) | |
| if err != nil { | |
| return fmt.Errorf("failed to decode bytes attribute %q: %w", key, err) | |
| } | |
| val = string(decoded) | |
| // Bytes attributes are stored in the database as base64-encoded strings. | |
| // Use the original base64 string directly to match the storage format. | |
| val = attr.Str() |
| @@ -0,0 +1,22 @@ | |||
| // Copyright (c) 2026 The Jaeger Authors. | |||
Copilot
AI
Jan 19, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The copyright year is set to 2026, but the current year is 2025. This should be corrected to 2025.
| // Copyright (c) 2026 The Jaeger Authors. | |
| // Copyright (c) 2025 The Jaeger Authors. |
| @@ -0,0 +1,84 @@ | |||
| // Copyright (c) 2026 The Jaeger Authors. | |||
Copilot
AI
Jan 19, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The copyright year is set to 2026, but the current year is 2025. This should be corrected to 2025.
| // Copyright (c) 2026 The Jaeger Authors. | |
| // Copyright (c) 2025 The Jaeger Authors. |
| @@ -0,0 +1,69 @@ | |||
| // Copyright (c) 2026 The Jaeger Authors. | |||
Copilot
AI
Jan 19, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The copyright year is set to 2026, but the current year is 2025. This should be corrected to 2025.
| // Copyright (c) 2026 The Jaeger Authors. | |
| // Copyright (c) 2025 The Jaeger Authors. |
| func buildBoolAttributeCondition(q *strings.Builder, args *[]any, key string, attr pcommon.Value) { | ||
| q.WriteString("arrayExists((key, value) -> key = ? AND value = ?, s.bool_attributes.key, s.bool_attributes.value)") | ||
| q.WriteString(" OR ") | ||
| q.WriteString("arrayExists((key, value) -> key = ? AND value = ?, s.resource_bool_attributes.key, s.resource_bool_attributes.value)") | ||
| *args = append(*args, key, attr.Bool(), key, attr.Bool()) | ||
| } | ||
|
|
||
| func buildDoubleAttributeCondition(q *strings.Builder, args *[]any, key string, attr pcommon.Value) { | ||
| q.WriteString("arrayExists((key, value) -> key = ? AND value = ?, s.double_attributes.key, s.double_attributes.value)") | ||
| q.WriteString(" OR ") | ||
| q.WriteString("arrayExists((key, value) -> key = ? AND value = ?, s.resource_double_attributes.key, s.resource_double_attributes.value)") | ||
| *args = append(*args, key, attr.Double(), key, attr.Double()) | ||
| } | ||
|
|
||
| func buildIntAttributeCondition(q *strings.Builder, args *[]any, key string, attr pcommon.Value) { | ||
| q.WriteString("arrayExists((key, value) -> key = ? AND value = ?, s.int_attributes.key, s.int_attributes.value)") | ||
| q.WriteString(" OR ") | ||
| q.WriteString("arrayExists((key, value) -> key = ? AND value = ?, s.resource_int_attributes.key, s.resource_int_attributes.value)") | ||
| *args = append(*args, key, attr.Int(), key, attr.Int()) | ||
| } | ||
|
|
||
| func buildBytesAttributeCondition(q *strings.Builder, args *[]any, key string, attr pcommon.Value) { | ||
| attrKey := "@bytes@" + key | ||
| val := base64.StdEncoding.EncodeToString(attr.Bytes().AsRaw()) | ||
| q.WriteString("arrayExists((key, value) -> key = ? AND value = ?, s.complex_attributes.key, s.complex_attributes.value)") | ||
| q.WriteString(" OR ") | ||
| q.WriteString("arrayExists((key, value) -> key = ? AND value = ?, s.resource_complex_attributes.key, s.resource_complex_attributes.value)") | ||
| *args = append(*args, attrKey, val, attrKey, val) | ||
| } | ||
|
|
||
| func buildSliceAttributeCondition(q *strings.Builder, args *[]any, key string, attr pcommon.Value) error { | ||
| attrKey := "@slice@" + key | ||
| b, err := marshalValueForQuery(attr) | ||
| if err != nil { | ||
| return fmt.Errorf("failed to marshal slice attribute %q: %w", key, err) | ||
| } | ||
| q.WriteString("arrayExists((key, value) -> key = ? AND value = ?, s.complex_attributes.key, s.complex_attributes.value)") | ||
| q.WriteString(" OR ") | ||
| q.WriteString("arrayExists((key, value) -> key = ? AND value = ?, s.resource_complex_attributes.key, s.resource_complex_attributes.value)") | ||
| *args = append(*args, attrKey, b, attrKey, b) | ||
| return nil | ||
| } | ||
|
|
||
| func buildMapAttributeCondition(q *strings.Builder, args *[]any, key string, attr pcommon.Value) error { | ||
| attrKey := "@map@" + key | ||
| b, err := marshalValueForQuery(attr) | ||
| if err != nil { | ||
| return fmt.Errorf("failed to marshal map attribute %q: %w", key, err) | ||
| } | ||
| q.WriteString("arrayExists((key, value) -> key = ? AND value = ?, s.complex_attributes.key, s.complex_attributes.value)") | ||
| q.WriteString(" OR ") | ||
| q.WriteString("arrayExists((key, value) -> key = ? AND value = ?, s.resource_complex_attributes.key, s.resource_complex_attributes.value)") | ||
| *args = append(*args, attrKey, b, attrKey, b) | ||
| return nil | ||
| } |
Copilot
AI
Jan 19, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The non-string attribute condition builders (buildBoolAttributeCondition, buildDoubleAttributeCondition, buildIntAttributeCondition, buildBytesAttributeCondition, buildSliceAttributeCondition, buildMapAttributeCondition) only check span and resource levels, but not scope level. However, buildStringAttributeCondition supports scope level attributes (lines 482-489). This creates an inconsistency where scope-level attributes can only be queried if they arrive as strings and have metadata. Consider adding scope-level support to all attribute type conditions for consistency.
Which problem is this PR solving?
Description of the changes
How was this change tested?
Checklist
jaeger:make lint testjaeger-ui:npm run lintandnpm run test